D2P2: database of disordered protein predictions
Abstract
We present the Database of Disordered Protein Prediction (D2P2), available at http://d2p2.pro (including website source code). A battery of disorder predictors and their variants, VL-XT, VSL2b, PrDOS, PV2, Espritz and IUPred, were run on all protein sequences from 1765 complete proteomes (to be updated as more genomes are completed). Integrated with these results are all of the predicted (mostly structured) SCOP domains using the SUPERFAMILY predictor. These disorder/structure annotations together enable comparison of the disorder predictors with each other and examination of the overlap between disordered predictions and SCOP domains on a large scale. D2P2 will increase our understanding of the interplay between disorder and structure, the genomic distribution of disorder, and its evolutionary history. The parsed data are made available in a unified format for download as flat files or SQL tables either by genome, by predictor, or for the complete set. An interactive website provides a graphical view of each protein annotated with the SCOP domains and disordered regions from all predictors overlaid (or shown as a consensus). There are statistics and tools for browsing and comparing genomes and their disorder within the context of their position on the tree of life.
Similar resources
Distribution and cluster analysis of predicted intrinsically disordered protein Pfam domains
The Pfam database groups regions of proteins by how well hidden Markov models (HMMs) can be trained to recognize similarities among them. Conservation pressure is probably in play here. The Pfam seed training set includes sequence and structure information, being drawn largely from the PDB. A long standing hypothesis among intrinsically disordered protein (IDP) investigators has held that conse...
Functional anthology of intrinsic disorder. 1. Biological processes and functions of proteins with long disordered regions.
Identifying relationships between function, amino acid sequence, and protein structure represents a major challenge. In this study, we propose a bioinformatics approach that identifies functional keywords in the Swiss-Prot database that correlate with intrinsic disorder. A statistical evaluation is employed to rank the significance of these correlations. Protein sequence data redundancy and the...
Functional Anthology of Intrinsic Disorder. I. Biological Processes and Functions of Proteins with Long Disordered Regions
Identifying relationships between function, amino acid sequence and protein structure represents a major challenge. In this study we propose a bioinformatics approach that identifies functional keywords in the Swiss-Prot database that correlate with intrinsic disorder. A statistical evaluation is employed to rank the significance of these correlations. Protein sequence data redundancy and the r...
The GTOP database in 2009: updated content and novel features to expand and deepen insights into protein structures and functions
The Genomes TO Protein Structures and Functions (GTOP) database (http://spock.genes.nig.ac.jp/~genome/gtop.html) freely provides an extensive collection of information on protein structures and functions obtained by application of various computational tools to the amino acid sequences of entirely sequenced genomes. GTOP contains annotations of 3D structures, protein families, functions, and ot...
Meta-Analysis of Methane Mitigation Strategies: Improved Predictions of Mitigation Potentials and Production Implications
The aim of this study was to use meta-analysis to identify the enteric methane (CH4) mitigation strategy that reduced CH4 emission without lowering production. To this end, a database initially developed was updated, compiling data from 61 publications (233 experiments) for various observations in dairy cattle on effects of hydrogen sink (H-sink), ionophore, lipid and conc...